GBE 



Alternative Processing as Evolutionary Mechanism for the 
Origin of Novel Nonprotein Coding RNAs 

Dingding Mo 1, Carsten A. Raabe 1, Richard Reinhardt 2, Juergen Brosius 1,*, and Timofey S. Rozhdestvensky 1,*

1 Institute of Experimental Pathology, ZMBE, University of Muenster, Muenster, Germany
2 Max Planck Genome Centre Cologne, Cologne, Germany

*Corresponding author: E-mail: rozhdest@uni-muenster.de, RNA.world@uni-muenster.de.
Accepted: October 3, 2013 

Abstract 

The evolution of new genes can ensue through either gene duplication and the neofunctionalization of one of the copies or the 
formation of a de novo gene from hitherto nonfunctional, neutrally evolving intergenic or intronic genomic sequences. Only very 
rarely are entire genes created de novo. Mostly, nonfunctional sequences are coopted as novel parts of existing genes, such as in the 
process of exonization whereby introns become exons through changes in splicing. Here, we report a case in which a novel nonprotein 
coding RNA evolved by intron-sequence recruitment into its structure. cDNAs derived from rat brain small RNAs revealed a novel small
nucleolar RNA (snoRNA) originating from one of the Snord115 copies in the rat Prader-Willi syndrome locus. We suggest that a single-point
substitution in the Snord115 region led to the expression of a longer snoRNA variant, designated as L-Snord115. Cell culture and
footprinting experiments confirmed that a single nucleotide substitution at Snord115 position 67 destabilized the kink-turn motif
within the canonical snoRNA, while distal intronic sequences provided an alternate D-box region. The exapted sequence displays 
putative base pairing to 28S rRNA and mRNA targets. 

Key words: evolution of novel nonprotein coding RNA variants, Prader-Willi syndrome (PWS), rat Snord115, processing
mutant, snoRNA biogenesis, K-turn motif.



Introduction 

Usually, novel genes are not generated de novo but evolve by 
duplication of existing genes and, if not inactivated and decay- 
ing as so-called pseudogenes, copies might change in a more 
or less gradual manner (Ohno 1970). A frequent mechanism of
amplification is segmental duplication of one or several genes 
in a locus by unequal crossing over. A rare extreme is whole 
chromosome or even whole genome duplication. A different 
route of duplication is restricted to single genes and occurs via 
RNA intermediates by the mechanism of retroposition. This 
involves conversion of usually mature RNAs, for example, 
mRNA into cDNA accompanied by more or less random inte- 
gration into the genome. Most often, this mechanism yields 
inactive retropseudogenes; for example, because of the lack of 
promoter elements necessary for expression. Should such reg- 
ulatory elements fortuitously be coopted at the genomic locus 
of integration, a functional retrogene might evolve (Brosius 
1991). True de novo formation of a gene out of hitherto neu- 
trally evolving DNA is considered to be rare (Levine et al. 2006), 
but de novo evolution might be more frequent than expected
(Neme and Tautz 2013). Interestingly, a mechanism termed
overprinting can generate novel protein products out of minimally
altered preexisting genes, simply by shifting the open
reading frame (Keese and Gibbs 1992). More common is the 
recruitment (exaptation) of novel modules to existing genes, 
such as exonization of intronic sequences (Lev-Maor et al. 
2003), as predicted by Gilbert (1978). Generally, at the 
onset, such exons are alternatively spliced only, yielding low 
amounts of the novel mature mRNA in addition to the original 
mRNA. Furthermore, as the alternative exon is usually slightly 
deleterious, neutral, or at best slightly advantageous, persistence
over tens or even hundreds of millions of years is the
exception rather than the rule (Krull et al. 2007). Functional 
nonprotein coding RNAs also arise by gene duplication includ- 
ing retroposition, as is the case for small nucleolar RNAs 
(snoRNAs) (Brosius 2003; Vitali et al. 2003; Weber 2006; 
Zemann et al. 2006; Schmitz et al. 2008). Furthermore, neu- 
ronal BC1 RNA arose in the common ancestor of rodents by 
retroposition of a mature tRNA(Ala). The fortuitous location of a
distal RNA polymerase III transcription terminator provided an 
additional 75 nt to contribute the 3' domain of BC1 RNA 
(DeChiara and Brosius 1987; Kim et al. 1994). 

The snoRNAs constitute a large family of small nonprotein
coding RNAs in eukaryotes and Archaea. The majority of 
snoRNAs, in complex with proteins as ribonucleoprotein parti- 
cles (RNPs, snoRNPs), are involved in posttranscriptional pro- 
cessing and maturation of RNAs. Except for U3, U8, U14, U17, 
and U22 snoRNAs that have been proposed to function as RNA 
chaperones to regulate preribosomal RNA (pre-rRNA) folding 
and mediate correct nucleolytic processing (maturation) (Kiss 
2004), most of the remainder direct site-specific posttranscriptional
modifications on 18S, 28S, and 5.8S rRNAs and some U
spliceosomal small nuclear RNAs (snRNAs) (Dragon et al. 2006;
Gagnon et al. 2007; Dieci et al. 2009). In addition, for a smaller
subset of snoRNAs that exhibit base complementarity to pre- 
rRNAs but do not guide endonucleolytic cleavages or nucleo- 
tide modifications, a chaperone-like function was also sug- 
gested but not experimentally validated (Vitali et al. 2003). 
For most of the known snoRNAs, the transient interaction 
with complementary regions in RNA targets mediates function. 

Based on conserved sequence and structural motifs, 
snoRNAs are divided into two subclasses, the C/D box 
snoRNAs and the H/ACA box snoRNAs, respectively (Kiss 
et al. 2006). The majority of C/D-box snoRNAs guide 2'-O-methylation
of RNA ribose moieties, while H/ACA-box
snoRNAs are involved in the isomerization of uridine to pseudouridine.
The 2'-O-methylation guide snoRNAs harbor conserved
C (RUGAUGA consensus) and D (CUGA) box motifs,
located near the 5'- and 3'-ends of the RNA, respectively
(many snoRNAs also contain internal copies of these elements
that are termed C' and D' boxes). An interaction between
snoRNA-termini results in the formation of a stem structure, 
whereas C and D-boxes are involved in kink-turn (K-turn) 
motif assembly that is recognized by the 15.5 kDa protein in
vertebrates (homolog of Snu13p in yeast and L7Ae in 
Archaea) (Watkins et al. 2000; Kuhn et al. 2002). Three fur- 
ther proteins, fibrillarin (a methyltransferase), Nop56p, and 
Nop58p, participate in the canonical core C/D box snoRNP 
assembly (Kiss et al. 2006). 
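
The box consensus sequences described above can be expressed as a simple pattern search. The following is a minimal Python sketch and not part of the original analysis: the helper `find_boxes` is hypothetical, and the 75-nt test sequence is an assumption, transcribed from one canonical rat Snord115 variant in the figure 1A alignment. Motif hits alone do not show that a box is functional; they only flag candidates.

```python
import re

# C box consensus RUGAUGA (R = A or G) and D box CUGA, as given in the text.
# Lookaheads are used so that overlapping candidate hits are not skipped.
C_BOX = re.compile(r"(?=([AG]UGAUGA))")
D_BOX = re.compile(r"(?=(CUGA))")


def find_boxes(seq):
    """Return 1-based start positions of C-box-like and D-box-like motifs."""
    rna = seq.upper().replace("T", "U")  # accept DNA or RNA spelling
    c_hits = [m.start() + 1 for m in C_BOX.finditer(rna)]
    d_hits = [m.start() + 1 for m in D_BOX.finditer(rna)]
    return c_hits, d_hits


# Assumption: 75-nt canonical rat Snord115 variant as read from figure 1A.
SNORD115 = (
    "GTCAATGATGACAACAGAAAGTCAAGAACAGAATGATGACATAAAAATCATGCTC"
    "AATAGGATTACGCTGAGGCC"
)

c_hits, d_hits = find_boxes(SNORD115)
print("C-box-like motif starts:", c_hits)
print("D-box-like motif starts:", d_hits)
```

With this sequence, the search reports one C-box-like hit near the 5'-end, one internal hit (the C' box position), and a single CUGA near the 3'-end.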

In vertebrate genomes, most snoRNAs are encoded in in- 
trons of either protein coding or nonprotein coding host genes 
(Dieci et al. 2009). Together with exons, they are transcribed by 
RNA polymerase II as hnRNA. Biogenesis of C/D box snoRNAs is 
a complex process that involves posttranscriptional snoRNP as- 
sembly coupled with nucleolytic processing of host gene pre- 
RNA introns and intranuclear trafficking (Filipowicz and Pogacic 
2002). In mammals, the majority of C/D box RNAs maps to 
intronic regions located ~70-80 nt upstream from the acceptor
splice site. They are processed in a splicing-dependent manner, 
involving general splicing factors (Hirose and Steitz 2001; 
Hirose et al. 2003, 2006). The remaining C/D box snoRNAs, 
including those in the repetitive clusters on human chromosomes
14 and 15, are located more distantly from the acceptor
splice site and considered not to interact directly with general
pre-RNA splicing factors during processing (Cavaille et al. 2000,
2002; Hirose et al. 2003). However, in both cases, binding of the
core snoRNP proteins is essential to prevent snoRNA degradation
during posttranscriptional processing (Richard and
Kiss 2006). The 15.5 kDa protein recognizes the terminal K-turn
motif formed between C and D-boxes of pre-snoRNA
and provides the scaffold for the other core-protein compo- 
nents to bind (Watkins et al. 2000; Cahill et al. 2002; Watkins 
et al. 2002; Kiss et al. 2006). The assembled pre-snoRNPs un- 
dergo 5'- and 3'-RNA exonucleolytic trimming, resulting in 
mature snoRNA-protein complexes that are transported to 
the nucleolus (Tycowski et al. 1993; Kiss and Filipowicz 1995; 
Cavaille and Bachellerie 1996; Watkins et al. 1996).

In humans, the Prader-Willi syndrome (PWS) is a neuroge- 
netic disorder caused by deletion or inactivation of imprinted 
genes within the PWS locus on paternally inherited chromosome
15. Apart from several protein-coding genes, this locus
harbors two large tandemly repeated clusters of C/D box
snoRNAs: SNORD116 and SNORD115, with 24 and 47 gene
copies, respectively, generated from introns of the U-UBE3A-AS
long nonprotein coding RNA (the typical arrangement is
one SNORD gene copy per intron; a few introns harbor two
copies of SNORD116 genes) (Cavaille et al. 2001; Wirth et al.
2001; Yin et al. 2012). Although deletion of the SNORD116
gene cluster reproduces key characteristics of the PWS phenotype
in patients and causes growth retardation in mice
(Skryabin et al. 2007; Ding et al. 2008; Sahoo et al. 2008;
de Smith et al. 2009), SNORD115 deletion appears to lack a
phenotype (Runte et al. 2005). Snord115 and Snord116 
belong to a subclass termed "orphan" snoRNAs as they lack 
apparent base complementarities to common RNA targets, 
suggesting functions apart from rRNA and snRNA processing 
(Bachellerie et al. 2002). 

On the basis of analysis of small nonprotein coding RNA 
enriched cDNA libraries from rat brain, we uncovered a novel 
snoRNA derived from the imprinted PWS locus. Our data sug- 
gest that this RNA arose by a single nucleotide substitution in 
an ancestral Snord115 gene copy. Experimental analysis indicated
that the nucleotide exchange led to destabilization of
the original Snord115 K-turn motif. In addition, the downstream
intron provided an alternative D-box motif and
sequences that enable alternate K-turn formation. These
structural alterations trigger alternative pathways of snoRNA
maturation and lead to the utilization of Snord115 3'-flanking
sequence resulting in a novel snoRNA variant. Our experimen- 
tal data reveal additional mechanistic insight into nonprotein 
coding RNA origin and evolution. 

Materials and Methods 

Generation of Recombinant Plasmid Constructs 

The L-Snord115 expression constructs were generated via 
polymerase chain reaction (PCR) amplification of rat genomic 
templates. The DNA extraction was conducted according to the
standard proteinase K method (Maniatis et al. 1989). The 
design of PCR primers to amplify the Snord115 gene flanked
by G2 and G1 exons was based on available rat cDNAs 
(CB616315) (Landers et al. 2004). The resulting PCR products 
were cloned into pDrive vectors using the QIAGEN PCR clon- 
ing kit according to the manufacturer's recommendations. 
Inserts were verified by sequencing and subcloned into the 
pcDNA3.1+ vector (Invitrogen) via BamHI and HindIII restriction
sites. Mutations in the L-Snord115 or Snord115 coding
regions were introduced by 5' overlapping PCR (Warrens et al. 
1997). The resulting PCR fragments were cloned into 
pcDNA3.1+ vector and verified by sequencing. All recombinant
plasmids were purified by cesium chloride (CsCl) gradient
centrifugation (Maniatis et al. 1989). 

Cell Culture and DNA Transfection 

HeLa cells were cultured in Dulbecco's modified Eagle
medium (DMEM) (Sigma), supplemented with 10%
fetal bovine serum (BioWest), 10 mM sodium pyruvate
(LifeTech), 100 U/ml penicillin (LifeTech), and 100 U/ml streptomycin
(LifeTech) at 37 °C in 5% (v/v) CO2. Transient transfections
were performed with the lipotransfection reagent
(Lipofectamine-2000, Invitrogen) according to the manufacturer's
recommendations, at 70-85% cell confluence in six-well
plates using 2 µg of plasmid DNA per transfection.
Lipofectamine-2000-DNA complexes were formed for
20 min at room temperature in OptiMEM (LifeTech) buffer.
The complexes were transferred to HeLa cells and incubated
for 6 h at 37 °C. Subsequently, the OptiMEM buffer was
replaced by DMEM (see earlier), and total RNA was
extracted 24-36 h posttransfection.

Total RNA Extraction and Northern Blot Hybridization 

Total RNA from HeLa cells and rat tissues was extracted using
the TRIzol reagent (Invitrogen) according to the manufacturer's
recommendations. Approximately 6 µg of total RNA
per sample was size fractionated on 8% (w/v) polyacrylamide
(29:1 acrylamide/bis), 7 M urea gels and electrotransferred to
positively charged nylon membranes (BrightStar-Plus,
Ambion, Bio-Rad). Before hybridization, the RNA was UV-crosslinked
to membranes (Stratalinker UV Crosslinker 2400,
Stratagene). Membranes were prehybridized in 20 ml of 0.5 M
sodium phosphate (pH 6.5 at 58 °C), 7% (w/v) sodium dodecyl
sulfate (SDS) buffer at 56 °C for 40 min. Subsequently,
northern blot hybridizations were performed with 50 pmol
of 5'-32P-labeled oligonucleotides (fig. 1A) in prehybridization
buffer at 56 °C overnight. Membranes were washed three
times in 0.1 M sodium phosphate (pH 6.5), 1% (w/v) SDS
buffer for 3 min each at 46 °C, and BioMax MS films (Kodak)
were exposed at -80 °C overnight.

In vitro Transcription of Different RNA Templates 

L-Snord115 RNA and mutants were in vitro transcribed by T7
RNA polymerase (Fermentas). The corresponding runoff
templates for in vitro transcription were generated by PCR
amplification. The PCR forward primer included sequences
of the T7 RNA polymerase promoter (supplementary table
S1, Supplementary Material online). In vitro transcribed fragments
of 28S rRNA were generated in a similar way using
human genomic DNA isolated from placenta as a template for
PCR reactions (supplementary table S1, Supplementary
Material online). In vitro transcription was conducted in
100 µl total reaction volume, supplemented with 40 mM
Tris-HCl pH 7.9, 6 mM MgCl2, 10 mM dithiothreitol (DTT),
10 mM NaCl, and 2 mM spermidine. Template amounts
ranged between ~0.5 and 1 µg per reaction. Each reaction
was performed with 2.5 mM NTPs, 40 U RNase inhibitor (Fermentas),
and 2,000 U of T7 RNA polymerase (Fermentas).
Transcription proceeded for 2 h at 37 °C. The synthesized
RNAs were separated on 8% (w/v) polyacrylamide (29:1
acrylamide/bis), 7 M urea gels and eluted in 0.3 M NaOAc (pH
5.2) buffer at 4 °C overnight. Subsequently, RNAs were ethanol
precipitated and dissolved in ddH2O.

Lead(II)-Footprinting Analysis

L7Ae protein was purified as described previously (Rozhdestvensky
et al. 2003). In vitro transcribed RNAs were dephosphorylated
by Antarctic Phosphatase treatment (New
England BioLabs). The resulting RNAs were subjected to T4
polynucleotide kinase (T4 PNK) (New England BioLabs) treatment
to incorporate [γ-32P]-ATP (Perkin Elmer) label at 5'-ends.
Lead acetate cleavage was performed as described, with minor
modifications (Youssef et al. 2007). In brief, 5'-32P-labeled RNAs
were heat-denatured at 90 °C for 1 min and immediately
chilled on ice for at least 2 min. RNA-L7Ae complex formation
was performed in 20 mM 4-(2-hydroxyethyl)-1-piperazineethanesulfonic
acid (HEPES)-KOH (pH 7.0), 200 mM potassium
acetate, 1.5 mM magnesium acetate, 2.5 µg/µl tRNA,
and 10 U RNase inhibitor (Fermentas); specific concentrations
of L7Ae protein are indicated in figure 3A. Footprinting analyses
were performed with freshly prepared 15 mM lead acetate
at room temperature. Cleavage was terminated after
10 min by ethylenediaminetetraacetic acid (EDTA) addition.
All reactions were ethanol precipitated and separated on
8% (w/v) polyacrylamide (38:2 acrylamide/bis), 7 M urea gels.
RNase T1 and alkaline hydrolysis RNA ladders were generated
according to the manufacturer's instructions (Ambion). MS films
(Kodak) were exposed to the gels overnight at -80 °C.

Mapping of 2'-O-Methylation on 28S rRNA

Mapping of possible 2'-O-methylation sites on 28S rRNA was
done by reverse transcription (RT) assays at low deoxyribonucleotide
triphosphate (dNTP) concentration (Maden 2001).
Briefly, ~0.5 µg total RNA, isolated from transiently transfected
HeLa cells (see Results and Discussion), was mixed
with 0.5 pmol of 5'-32P-labeled oligonucleotide primer for
RT. The mixtures were denatured at 85 °C for 2 min and
allowed to anneal at room temperature. RT was performed in
20 µl reaction volume, containing 50 mM Tris-HCl (pH 8.5),
30 mM KCl, 8 mM MgCl2, 20 U RNase inhibitor (Fermentas),
and 2.5 U of Transcriptor reverse transcriptase (Roche).
Different concentrations of dNTP (10 µM, 100 µM, and
1 mM) (Roche) were added to individual reactions. RT was
performed at 55 °C for 40 min (supplementary fig. S1,
Supplementary Material online) and terminated by the addition
of 2x RNA loading dye (Ambion). Aliquots were separated
on 8% (w/v) polyacrylamide (38:2 acrylamide/bis), 7 M
urea gels. To monitor potential stops of RT caused by RNA
secondary structures, RT reactions conducted with the identical
primer and 0.02 pmol of in vitro transcribed 28S rRNA
fragments served as control (fig. 3A). MS films (Kodak) were
exposed to resulting gels overnight at -80 °C.

Northern blot oligo NBRB52S&F: 3' TACTGTATTTTTAGTACGAGTTATCCTAATG 5'

Boxes (5' to 3'): C box, D' box, C' box, D box
1 (76 nt) 5' GGTCAATGATGACAACATTAAGTCAAGAACAGAATGATGACATAAAAATCATGCTCAATAGGATTACGCTGAGG(CC) 3'
2 (75 nt) 5' GTCAATGATGACAACATTAAGTCAAGAACAGAATGATGACATAAAAATCATGCTCAATAGGATTACGCTGAGG(CC) 3'
3 (76 nt) 5' GGTCAATGATGACAACAGAAAGTCAAGAACAGAATGATGACATAAAAATCATGCTCAATAGGATTACGCTGAGG(CC) 3'
4 (75 nt) 5' GTCAATGATGACAACAGAAAGTCAAGAACAGAATGATGACATAAAAATCATGCTCAATAGGATTACGCTGAGG(CC) 3'
5 (156 nt) 5' GTCAATGATGACAACAGAAAGTCAAGAACAGAATGATGACATAAAAATCATGCTCAATAGGATTACCCTGAGGCCCAACCAGGGAGCCAGGGTAACAAGCACTACTTAGTCTCTTTGAGGACCACTTGCGGGACTCATCTGANCTGCTCTGATG(CC) 3'
Boxes (5' to 3'): C box, D' box, C' box, D box 1, D box 2, D box 3

Northern blot oligo NBLRB52M: 3' GGGACTCCGGGTTGGTCCCT 5'
Northern blot oligo NBRB52L3': 3' GAGAAACTCCTGGTGAACGCCCTG 5'

Fig. 1. — Identification and expression analysis of L-Snord115 RNA. (A) Sequence alignment between known rat Snord115 variants (1-4) and the novel
L-Snord115 RNA (5). snoRNA sizes in nucleotides are indicated on the left in parentheses. Putative C, D, C', and D' boxes are in bold letters and, in addition,
designated above and below the alignment. snoRNA regions complementary to the northern blot probes are underlined. Hybridization probes and their
sequences are also indicated. (B and C) Northern blot analysis with NBRB52S&F (B) and NBLRB52M (C) probes to examine L-Snord115 expression in different
rat tissues indicated above the blot lanes. L-Snord115 and Snord115 RNAs and their estimated sizes (in nt) are indicated on the margins. As a loading control, a
negative image of ethidium bromide stained 5.8S rRNA signals is shown at the bottom.

Results and Discussion 

Identification of Novel snoRNA from the Rat PWS-Locus 

On the basis of specialized rat brain cDNA libraries enriched for
small nonprotein coding RNAs (Raabe CA, Brosius J, and
Rozhdestvensky TS, unpublished data), we identified numerous
isoforms of Snord115 snoRNA (supplementary fig. S2,
Supplementary Material online). The library design favored
full-length cDNAs because the synthesis relied on RNA 5'-
and 3'-end modifications by adapter ligation and C-tailing, respectively
(Raabe et al. 2010). We identified a novel nonprotein
coding RNA candidate, almost twice as long as the
previously known Snord115 RNAs (fig. 1A), designated as
long-Snord115 (L-Snord115). The RNA, represented by 26
cDNAs, is 156 nt long and maps to the rat Snord115 gene
cluster. L-Snord115 displays sequence identity to an annotated
rat canonical Snord115 isoform (Ensembl Transcript:
ENSRNOT00000052941) throughout its entire 5'-domain,
except for a single G to C nucleotide substitution located adjacent
to the D-box of the known snoRNA (fig. 1A). Close inspection
of the 3'-region of L-Snord115 RNA revealed two
additional putative D-box elements (CUGA sequences) located
14 and 4 nt upstream from the RNA 3'-end, respectively (fig.
1A). Therefore, sequence and structural analysis suggest that
the identified nonprotein coding RNA candidate is a potentially
novel C/D-box snoRNA. For further validation and to establish
the expression profile of L-Snord115 RNA across different tissues,
northern blot hybridization on total RNA isolated from rat
brain, heart, kidney, liver, and lungs was carried out (fig. 1B and
C). Three specific oligonucleotide probes complementary to the
5'-, central, and 3'-regions of L-Snord115 RNA were designed
(fig. 1A). We detected brain-specific expression of the novel
snoRNA candidate (fig. 1B and C and data not shown), paralleling
the expression profile of canonical Snord115 snoRNAs.
The oligonucleotide probe (NBRB52S&F) corresponding to the
5'-domain of L-Snord115 detected the canonical Snord115
isoforms as well as L-Snord115 RNA (fig. 1B). Probes complementary
to the central and 3'-portion of the novel snoRNA
isoform (NBRB52M and NBRB52L3', respectively) identified
brain-specific signals of ~160 nt, indicative of L-Snord115
RNA (fig. 1C and data not shown). Genomic analysis revealed
that the 5'-region of L-Snord115 RNA only differed by a single
nucleotide substitution from the rat-annotated Snord115. We
therefore investigated the potential impact of single nucleotide 
transversion on the biogenesis of a novel snoRNA variant. 
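
The relationship between a northern blot probe and its target region can be checked with a few lines of code. The sketch below is illustrative only and assumes that the probe and snoRNA sequences are as transcribed from figure 1A; the helper `target_of` is hypothetical. Because the probe is written 3' to 5', complementing it base by base yields the target strand in 5' to 3' orientation.

```python
# Watson-Crick complements for DNA letters.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def target_of(probe_3_to_5):
    """Return the 5'->3' target sequence bound by a probe written 3'->5'."""
    return "".join(COMPLEMENT[base] for base in probe_3_to_5)


# Assumption: both sequences are transcribed from figure 1A.
PROBE_NBRB52SF = "TACTGTATTTTTAGTACGAGTTATCCTAATG"  # written 3'->5'
SNORD115 = (
    "GTCAATGATGACAACAGAAAGTCAAGAACAGAATGATGACATAAAAATCATGCTC"
    "AATAGGATTACGCTGAGGCC"
)

target = target_of(PROBE_NBRB52SF)
start = SNORD115.find(target)  # 0-based index; -1 would mean no perfect match
print("target:", target)
print("1-based span:", start + 1, "to", start + len(target))
```

Under these assumptions, the probe target ends immediately upstream of position 67, which is consistent with NBRB52S&F detecting both the canonical and the long form.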

Notably, Blast (http://blast.ncbi.nlm.nih.gov/Blast.cgi, last
accessed October 28, 2013) and Blat
(http://genome.ucsc.edu/cgi-bin/hgBlat?command=start, last accessed October
28, 2013) rat genome searches did not reveal a perfect match
for either the L-Snord115 query sequence or its shorter form.
The best Blat hit displayed only 99.4% sequence identity to
the available annotated sequence of the rat Snord115 locus,
with a single C -> G transversion adjacent to the D-box. 
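
The reported identity value is what one would expect for a single mismatch across the full-length query. A short worked check, assuming that identity was computed over all 156 nt of L-Snord115 (the text does not state the aligned length explicitly; `percent_identity` is a hypothetical helper):

```python
def percent_identity(aligned_length, mismatches):
    """Percentage of identical positions in an ungapped alignment."""
    return 100.0 * (aligned_length - mismatches) / aligned_length


# Assumption: the 156-nt L-Snord115 query aligned end to end, one mismatch.
identity = percent_identity(156, 1)
print(f"{identity:.1f}%")
```

155 identical positions out of 156 give 99.36%, which rounds to the 99.4% quoted above.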

Because of the repetitive structure of the snoRNA clusters 
within PWS loci in mammals, the genomic assembly of this 
region often remains incomplete (Nahkuri et al. 2008). We 
therefore PCR amplified the rat PWS genomic region, pre- 
sumed to contain the novel variant, with primers located in 
the flanking G1 and G2 exons. The resulting PCR products 
were inserted into the pDrive vector and subsequently 
sequenced to verify the L-Snord115-containing insert.

To investigate the unusual processing pattern of 
L-Snord115, we subcloned the corresponding construct into 
the pcDNA 3.1 eukaryotic expression vector (fig. 2A, sequence
1) and performed transient transfection experiments in HeLa
cells. Total RNA was examined by northern blot hybridization
using the NBRB52S&F probe from the 5'-region of the RNA. In
HeLa cells, human SNORD115 was not detected, as the
endogenous genes are silent in this cell type (fig. 2B, vector lane). In
transfected HeLa cells, two signals corresponding to Snord115
and L-Snord115 RNAs were observed (fig. 2B, lane 1).
Presumably, alternative posttranscriptional processing gener- 
ates both snoRNAs from a single gene repeat unit. 

Analysis of Posttranscriptional Processing of Snord115

Canonical C/D box snoRNAs including all Snord115 isoforms
contain consensus C and D boxes at their 5'- and 3'-ends, 
respectively (Cavaille et al. 2000; Bachellerie et al. 2002; 
Nahkuri et al. 2008). These elements are part of the K-turn 
structure motif that is a hallmark of eukaryal and archaeal 
snoRNAs of this type (Kiss et al. 2006; Gagnon et al. 2007; 
Dieci et al. 2009). The double-stranded K-turn motif resembles 
a variation of the "helix, internal-loop, helix" type of RNA 
secondary structures. Typically, it contains a 5'-canonical 
stem forming the base of the motif that is followed by a 3-nt-long
asymmetric bulge and ending in A•G and G•A
sheared base pairs that constitute the noncanonical stem
(Klein et al. 2001). The phosphodiester backbone of
the bulge nucleotides forms a sharp turn (kink) of ~120° in
the helix toward the minor groove. The K-turn conformation is
stabilized by base stacking interactions between adenosines of
the A•G base pairs and the canonical stem (Klein et al. 2001).
In addition, bases located in the bulge also participate in stacking
interactions with nucleotides of the canonical and noncanonical
stems and further contribute to stabilize the motif (Lin
et al. 2011). To form K-turns, the RNA requires interaction with
specific proteins and/or metal ions (Matsumura et al. 2003;
Goody et al. 2004; Turner and Lilley 2008). K-turn motifs in
mammalian C/D box snoRNAs are associated with the
15.5 kDa protein, which is also required for the assembly of
core snoRNP (Watkins et al. 2000). Deletions or mutations
within the terminal C- or D-boxes disturb K-turn motif formation
and protein binding and therefore result in snoRNA
exonucleolytic degradation during processing (Darzacq and
Kiss 2000; Filipowicz and Pogacic 2002).

To gain further insight into mechanistic details of
L-Snord115 processing, we generated and analyzed snoRNA
mutants. We postulated that processing of L-Snord115 is
dependent on the formation of a K-turn motif involving the
5'-C-box and the potential D-box2 or D-box3 of L-Snord115
RNA (figs. 1A and 2A). We designed experiments to identify
the functional 3'-terminal D-box of L-Snord115. Two constructs
were generated harboring mutations in D-box2
or D-box3, respectively (fig. 2A-C, sequences 6 and 7).
Mutation of D-box2 did not interfere with L-Snord115 posttranscriptional
processing and stability (fig. 2B, lane 6). In contrast,
a GpA to CpU (positions 151 and 152) substitution
within D-box3 completely abolished the expression of the
long L-Snord115 form but not of the shorter canonical form
(fig. 2B, lane 7; fig. 2C). As expected, mutation of both boxes
abolished L-Snord115 expression (fig. 2B, lane 8). To further
investigate the importance of D-box3, we stabilized the 5'-canonical
stem by replacing cytosine at position 155 with
adenosine (fig. 2C). The corresponding base exchange
generates an additional U-A Watson-Crick base pair
instead of a U-C mismatch at the base of the canonical
stem in the K-turn motif (fig. 2A, construct 2; fig. 2C).
Accordingly, northern blot analysis showed an increase in
L-Snord115 accumulation. This observation further supports
the involvement of D-box3 in formation of the K-turn motif
within the longer snoRNA structure. Stabilization of this motif
resulted in a shift of the snoRNA processing equilibrium toward
L-Snord115 (fig. 2B, lane 2). To further analyze the posttranscriptional
processing equilibrium between canonical
Snord115 and L-Snord115 RNA, we mutated D-box1 to abolish
the competition with D-box3 for K-turn motif formation
(fig. 2A, construct 5; fig. 2D). In transient transfections conducted
with the D-box1 mutant, we detected only
L-Snord115 RNA in northern blots (fig. 2A and B, lane 5; fig. 2D).

Fig. 2. — Analyses of L-Snord115 posttranscriptional processing. (A) Schematic representation of expression constructs containing the genomic
L-Snord115/Snord115 repeat unit harboring the snoRNA and relevant sequences (1, 3) or snoRNA mutants (2, 4-8) used in expression studies.
Nucleotides representing putative snoRNA boxes are in bold letters. The G67 to C67 substitution and C155 to A155 mutation leading to K-turn stabilization
are highlighted in red. Mutations in putative D-boxes are in blue lettering. (B) Northern blot analysis of total RNA isolated from transfected HeLa cells.
Transfected pcDNA 3.1 control vector or constructs 1-8 (as represented in A) are indicated above the respective lanes. Arrows indicate signals of snoRNAs
and 5.8S rRNA (as loading control, a negative image of an ethidium bromide stained gel is shown at the bottom). (C and D) Putative secondary structures for
terminal K-turn motifs in L-Snord115 (C) and Snord115 RNA (D). Nucleotide substitutions (mutations) are indicated as in (A).

Fig. 3. — Structural analysis of snoRNA/L7Ae RNP complexes with lead acetate. (A) In vitro transcribed 5'-32P-labeled RNAs (indicated on top) were
incubated with increasing concentrations of L7Ae protein (0, 0.3, and 0.6 µM) and treated with 15 mM of lead acetate (+ lanes). As control, the corresponding
RNA incubated with 0.6 µM of L7Ae protein without lead cleavage was loaded (- lanes). To determine RNA cleavage sizes, alkaline and RNase T1
digestions of L-Snord115 RNA were included (indicated as OH or T1 ladders, respectively). Positions of 3'-G residues of RNase T1 cleavage products are
indicated on the left. On the right, the regions of putative snoRNA boxes are designated. (B and C) Structural models suggested for posttranscriptional
processing of snoRNA resulting in Snord115 (B) or L-Snord115 (C) RNAs, respectively.

All cDNAs representing L-Snord115 RNA detected in our
cDNA library screens contain cytosine adjacent to D-box1 at
position 67 (C67). Most of the known mammalian Snord115 isoforms
harbor guanine (G67) at this position (Nahkuri et al.
2008). Therefore, we investigated the potential influence of the
G67 to C67 substitution on L-Snord115 RNA maturation. A
construct containing the corresponding rat genomic sequence
with G67 instead of C67 adjacent to D-box1 (fig. 2A,
construct 3) only yielded Snord115 RNA (fig. 2B, lane 3).
Secondary structure analysis of the potential K-turn motif
formed between C-box and D-box1 of Snord115 suggested
that G67 might stabilize the noncanonical stem of the K-turn
by base pairing with the C12 nucleotide (fig. 2D). The increase
in stability is likely to explain the exclusive generation of
Snord115 RNA in transient transfections. To examine whether
structural stabilization of the alternative K-turn motif formed
between the C-box and D-box3 results in generation of
L-Snord115 RNA, we introduced a C155 to A155 base
change into the G67-containing construct (fig. 2A, sequence
4). In HeLa cell transfections, only the short Snord115 RNA was
detected, indicating that G67 is sufficient to shift
the processing equilibrium to the canonical Snord115 variant
(fig. 2B, lane 4). In conclusion, our data indicate that the
single G to C nucleotide substitution in one of the snoRNA
gene copies permits fortuitous recruitment of an external
D-box in an appropriate sequence context, located in the
3'-flanking intron. This recruitment results in alternative
K-turn motif formation and permits L-Snord115 RNA
generation. 
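
The pairing arguments used in this section (U-C versus U-A at the base of the canonical stem, and C12 opposite G67 versus C67) reduce to a Watson-Crick complementarity check. The following minimal sketch is an illustration only: the helper `is_watson_crick` and the labels are hypothetical, and the check deliberately ignores G-U wobble pairs, the sheared A•G pairs of the K-turn, stacking, and protein binding, so it says nothing about actual structural stability.

```python
# Canonical Watson-Crick RNA pairs only (a deliberate simplification).
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def is_watson_crick(base_a, base_b):
    """Return True if the two RNA bases form a canonical Watson-Crick pair."""
    return (base_a.upper(), base_b.upper()) in WATSON_CRICK


# Base combinations discussed in the text; labels are for illustration.
pairs = {
    "U-C at stem base (L-Snord115, C155)": ("U", "C"),
    "U-A at stem base (C155 to A155 mutant)": ("U", "A"),
    "C12 opposite C67 (L-Snord115)": ("C", "C"),
    "C12 opposite G67 (canonical Snord115)": ("C", "G"),
}

results = {label: is_watson_crick(*bases) for label, bases in pairs.items()}
for label, paired in results.items():
    print(f"{label}: {'pair' if paired else 'mismatch'}")
```

In this simplified view, each stabilizing change (A155, G67) converts a mismatch into a canonical pair, matching the direction of the processing shifts observed in the transfection experiments.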

Probing of Putative RNA Structural Conformations 

For further analysis of structural elements underlying 
Snordl 15 posttranscriptional processing, we performed lead 
acetate cleavage experiments. Lead ions usually catalyze phos- 
phodiester bond cleavage within unstructured single stranded 
(bulges, loops, etc) or flexible RNA regions (Huntzinger et al. 
2008). Protein(s) interacting with RNA might protect the 
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phosphodiester backbone from cleavage (Huntzinger et al. 
2008). As mentioned earlier, K-turn formation requires speci- 
fic protein(s) to interact with RNA. Archaeal ribosomal protein 
L7Ae is a functional homolog of mammalian C/D box snoRNA 
15.5 kDa protein (Kuhn et al. 2002). It has been reported to
specifically recognize and stabilize different K-turn structural 
motifs (Rozhdestvensky et al. 2003). Therefore, we used recombinant
L7Ae protein as a structural component to form and
protect K-turns within L-Snord115 and the corresponding 
RNA mutants. In addition to L-Snord115 (fig. 2A, construct 
1), other in vitro transcribed RNAs were examined: 1) the
L-Snord115 C67 to G67 substitution transcript, which yielded only
canonical Snord115 during cell culture transfection
(fig. 2A, construct 3; fig. 2B, lane 3); 2) the L-Snord115 C155
to A155 mutant, which resulted in increased generation of
L-Snord115 RNA (fig. 2A, construct 2; fig. 2B, lane 2); and
3) the L-Snord115 D-box1 mutant, which exclusively generated
L-Snord115 snoRNA (fig. 2A, construct 5; fig. 2B, lane 5).

When comparing the lead cleavage results of the investi- 
gated L7Ae RNP complexes, the most obvious differences 
were observed within the D-box1 region of snoRNAs 
(fig. 3A). We were unable to resolve the D-box3 RNA region
because of its proximity to the 3'-end of the RNA. In 
L-Snord115 containing the C67 to G67 substitution, the 
D-box1 region appeared completely protected from cleavage. 
This indicated that these nucleotides are involved in K-turn 
motif formation and bound to L7Ae protein in the majority 
of RNA molecules (fig. 3A-C). Hence, the presence of the G67
nucleotide in the L-Snord115 sequence favors an RNA structure
where the D-box1 forms a K-turn motif with the 5'-terminal
C-box region (fig. 3A and B). In experiments with the L7Ae/L-Snord115
(C67) RNP complex, we detected slight RNA backbone
cleavage in the D-box1 sequence (fig. 3A). In agreement
with our transfection experiments, at least two major RNA
structural conformations resulted from competition between
D-box1 and D-box3 for the 5' C-box sequence, in both cases
leading to a K-turn motif (fig. 3A-C). Examining the
L-Snord115 C155 to A155 mutant, we observed a slight in- 
crease in cleavage within the D-box1 region. This supported 
our previous observation that the C155 to A155 mutation 
stabilized the K-turn motif formation between C-box and 
D-box3 (fig. 2B and C) and is consistent with the accumulation
of L-Snord115 as observed in northern blots (fig. 2B).

Finally, lead acetate footprinting with L-Snord115 harbor- 
ing the GpA to CpU substitution in D-box1 showed strong 
cleavage in the mutated region (fig. 3A and C), correlating with
our transfection studies, where only L-Snord115 RNA was 
generated from the construct when the D-box1 motif was 
deactivated (fig. 2B). Notably, D', C', and C-boxes were completely
protected from lead cleavage in all tested RNPs, indicating
that in the investigated RNAs those nucleotides were
involved in K-turn formation and therefore bound to L7Ae. In 
summary, the G67 to C67 nucleotide substitution destabilized 
the canonical Snord115 terminal K-turn motif during 



RNA processing and led to an additional RNA structure that 
allowed for L-Snord115 maturation during posttranscriptional
processing. 

Does the L-Snord115 RNA Variant Have a Function?

The majority of C/D-box snoRNAs exhibit complementarity
to rRNAs or snRNAs, guiding posttranscriptional modification
of their targets. We therefore performed computational 
analysis using a modified DNAMAN (version 6.015) software 
(Zemann et al. 2006) to screen for putative antisense ele- 
ments located within the 3'-region of L-Snord115 that 
could potentially target rRNA or snRNA molecules. We identified
an 8 nt sequence element adjacent to D-box3 of
L-Snord115 that exhibits base complementarity to an evolutionarily
conserved region of 28S rRNA. The analysis suggested
that L-Snord115 might guide 2'-O-methylation of rat 28S
rRNA at G4737, corresponding to G4980 in human rRNA
(fig. 4A). In mammalian rRNA, this nucleotide modification
has not been reported (Lestrade and Weber 2006).
Therefore, we experimentally analyzed the potential involvement
of L-Snord115 in methylation of endogenous human
(during HeLa transfection experiments) and rat brain 28S
rRNAs, respectively. Biochemical analysis by reverse transcription
(RT) to verify the potential modification did not reveal stops at low dNTP
concentrations, indicating that rat L-Snord115 does not guide
G4980 2'-O-methylation in rat brain and HeLa cells at detectable
levels (supplementary fig. S1A and S1B, Supplementary
Material online).
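
The kind of antisense screen described above can be illustrated with a minimal sketch. It is not the modified DNAMAN procedure used in the study; it only shows, under simplifying assumptions, how stretches of perfect Watson-Crick complementarity between a guide RNA and a target RNA could be located. The function names, the 8 nt default threshold, and the toy sequences are hypothetical, and G-U wobble pairs and mismatches are deliberately ignored.

```python
# Minimal sketch: find perfect antiparallel Watson-Crick complementarity
# between a guide RNA and a target RNA. All names and sequences here are
# hypothetical illustrations, not data from the study.

COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA string (A, C, G, U only)."""
    return rna.upper().translate(COMPLEMENT)[::-1]


def find_antisense_sites(guide: str, target: str, min_len: int = 8):
    """Return (guide_start, target_start, length) for every maximal stretch
    of perfect pairing that is at least min_len nucleotides long.

    Both start values are 0-based offsets counted from the 5' end of the
    respective sequence.
    """
    probe = reverse_complement(guide)  # reads 5'->3' in the target's sense
    target = target.upper()
    n, m = len(probe), len(target)
    hits = []
    for i in range(n):
        for j in range(m):
            # Skip positions that lie inside a match starting further left.
            if i > 0 and j > 0 and probe[i - 1] == target[j - 1]:
                continue
            k = 0
            while i + k < n and j + k < m and probe[i + k] == target[j + k]:
                k += 1
            if k >= min_len:
                # probe[i:i+k] is the reverse complement of guide[n-i-k:n-i]
                hits.append((n - i - k, j, k))
    return hits


# Toy example: a 12 nt guide and a target carrying a 9 nt antisense site.
toy_guide = "AUGCCGUAAGCU"
toy_target = "AAAACUUACGGCAAAAA"
print(find_antisense_sites(toy_guide, toy_target))
```

In this toy case the guide positions 1 to 9 pair with target positions 4 to 12. A realistic screen would additionally have to tolerate G-U pairs and restrict the search to the region directly upstream of a D-box, which this sketch does not attempt.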

The complementarity of L-Snord115 to 28S rRNA theoretically
extends up to 11 nt, involving parts of the D-box element
(fig. 4A). This might destabilize the K-turn motif upon binding
and, in turn, lead to dissociation of the 15.5 kDa protein.
Interaction with the 15.5 kDa protein is required to recruit the
core snoRNP proteins, including the 2'-O-methyltransferase,
fibrillarin (Lafontaine and Tollervey 2000; Watkins et al. 2000;
Dragon et al. 2006). This might be one of several potential 
explanations as to why this relatively young snoRNA variant 
does not, despite the theoretical complementarity, target 
modification of G4980 in 28S rRNA. Alternatively, the novel
variant might be involved in chaperone-like functions to stim- 
ulate RNA folding, as suggested for other snoRNAs (Vitali et al. 
2003). 

There are snoRNAs whose targets or functions are unknown,
including those encoded by the Snord115 and
Snord116 gene clusters in the PWS locus. However, in
Snord115 RNA, an 18 nt long complementarity to the alternatively
spliced exon Vb of the 5HT2c serotonin receptor pre-mRNA
has been predicted (fig. 4B) (Cavaille et al. 2000). The
targeted region is also subject to enzymatic posttranscriptional
A-to-I editing by two proteins termed "adenosine deaminase
acting on RNA" (ADAR1 and ADAR2) (fig. 4B) (Burns et al.
1997; Vitali et al. 2005). The alternative splice site is located
13 nt upstream from the predicted complementarity to
A: C-box, D-box1, D-box3

5' GUCAAUGAUGA-34 nt-AAUCAUGCUCAAUAGGAUUACCCUGA-55 nt-UGCGGGACUCAUCUGAGCUGCUCUGAUGCC 3'

Fig. 4. L-Snord115 snoRNA and its putative RNA targets. (A) Potential base pairing between L-Snord115 RNA and the 3'-region of 28S rRNA. The
predicted 2'-O-methylated nucleotide (G4980) is shown in bold. Additional base pairings involving 3 nt of the D-box are indicated by dotted lines. (B) Putative
base pairing of L-Snord115 RNA with exon Vb of 5HT2c pre-mRNA and exon IX of Gpr156 mRNA, respectively. Parts of the alternatively spliced exon Vb and
A to I editing sites in 5HT2c pre-mRNA are indicated (A, B, E, C, D).



SNORD115 RNA and leads to a truncated serotonin receptor
(Cavaille et al. 2000). The E, C, and D editing sites on pre- 
mRNA overlap with the targeted region. Posttranscriptional A 
to I editing has been reported to decrease the efficiency of 
G-protein coupling and therefore generates 5HT2c receptor 
variants with reduced activity (Berg et al. 2001; Vitali et al.
2005). Hence, perfect base complementarities displayed by
the antisense element of Snord115 RNA to the alternatively 
spliced and posttranscriptionally edited exon of 5HT2c pre- 
mRNA suggest a tempting model for regulation of serotonin 
receptor biogenesis by snoRNAs (Cavaille et al. 2000). 
Although in vitro analysis suggested a potential involvement
of SNORD115 RNA in regulating alternative splicing and editing
of 5HT2c pre-mRNA (Vitali et al. 2005; Kishore and Stamm
2006), in vivo confirmation remains elusive thus far (Doe et al.
2009).

Similar to Snord115, the L-Snord115 RNA variant contains
the antisense element to the 5HT2c pre-mRNA located in the
5'-portion of the snoRNA (fig. 4B). Computational searches to
identify potential mRNA targets for the guide element located
in the 3'-part of L-Snord115 identified a 19 nt complementarity
between a region directly adjacent to the D-box3 and protein-coding
exon 9 of the metabotropic glutamate receptor
Gpr156 mRNA (fig. 4B). Interestingly, both 5HT2c and Gpr156
proteins are members of the G protein-coupled receptor 



family (Stam et al. 1994; Calver et al. 2003). However, until 
there is sound in vivo evidence for a functional interaction of 
Snord115 or L-Snord115 RNAs with mRNA targets, the complementarities
should be considered fortuitous.



Conclusion 

We identified a novel brain-specific C/D-box snoRNA variant in
the rat PWS locus. The potential to generate L-Snord115 RNA
from one of the Snord115 copies hinges on sequences in two
separate regions. An intronic sequence provided an alternative
D-box motif, while the canonical snoRNA coding region acquired
a crucial G67 to C67 transversion adjacent to the
Snord115 canonical D-box. The latter change led to a slight
destabilization of the K-turn motif formed between 5'-C-box
and 3'-D-box regions of Snord115 RNA. The presence of an
additional D-box region in the 3' flanking sequence provided
nucleotides for an alternative K-turn formation. This structure
is assembled between the Snord115 5'-C-box and the distal
intronic D-box and is necessary to express the novel
L-Snord115 RNA variant. However, the changes did not completely
abolish canonical Snord115 production. Instead, they
resulted in a posttranscriptional processing equilibrium yielding
Snord115 as well as the novel L-Snord115 RNA variant.

All snoRNAs encoded within the PWS locus lack significant
base complementarities to the classical rRNA or snRNA targets
(Cavaille et al. 2000). However, L-Snord115 exhibits a
complementarity of 8-11 nt to the 3'-domain of 28S rRNA.
Experimental approaches failed to identify the corresponding
2'-O-methylation at detectable levels. Despite proposals that
members of the Snord115 snoRNA family are involved in regulation
of A-to-I editing or alternative splicing, solid in vivo evidence
is still lacking. In any event, the novel snoRNA variant is
restricted to rat but absent in mouse and, hence, at most ~25
million years old. By analogy, when we studied Alu element 
exonizations out of introns dating back between 20 and 60 
Ma, many of them were lost again on their way through the 
various Old World, New World monkey, and Ape lineages 
(Krull et al. 2005). Once more, this analogy is not surprising 
as most such events initially are slightly deleterious or neutral 
and rarely more or less advantageous, and persistence of 
novel parts of existing genes is rather the exception than the 
rule. Significantly, older events, such as exonizations of mammalian-wide
repetitive elements, exhibited evidence for purifying
selection (Krull et al. 2007).

However, despite the low odds, one should not underestimate
the significance of exaptations of genetic novelties. For
example, exaptation of a recombinase of a DNA transposon
was perhaps near neutral for many million years. Nevertheless,
it was a key event for the evolution of the immune system in
jawed vertebrates (Kapitonov and Jurka 2005).

Here, we revealed a mechanism by which new isoforms of 
nonprotein coding RNAs evolve. Based on our current understanding
of snoRNA evolution, new members arise by cis- or
trans-duplication of ancestral snoRNA genes (Vitali et al.
2003; Weber 2006; Zemann et al. 2006; Schmitz et al.
2008). Cis-duplications are considered to be generated by recombination
and lead to integration of new snoRNA copies
into neighboring introns of the same host gene. Trans-duplications
are mediated by retroposition and result in random
integration of snoRNA retrotransposons. The mechanism of
L-Snord115 generation is different from the above. It demonstrates
that the corresponding pre-snoRNA structure during
alternative posttranscriptional processing is subject to length 
variation resulting in extension or reduction of snoRNA se- 
quences. Based on our data, it is tempting to suggest that 
many of the known snoRNAs larger or smaller than the ca- 
nonical structures arose by similar mechanisms. In summary, 
our data demonstrate new aspects in nonprotein coding RNA 
evolution and biogenesis. 

Supplementary Material 

Supplementary figures S1-S2 and table S1 are available at 
Genome Biology and Evolution online (http://www.gbe.oxfordjournals.org/).